ABSTRACT

Since the end of the Cold War, predicting a nation state’s instability has been a challenging national security issue for the United States. This thesis presents several methods to predict the conflict potential for failed nation states by comparing their social, economic, political, and military statistics with those in the past. This study uses the Brier scoring rule to evaluate the performance of these probability prediction methods. The study provides insights into situations where one method is expected to outperform the others.
EXECUTIVE SUMMARY

For the United States, nation states’ instabilities, or internal conflicts, have accounted for a significant proportion of military operations. These instabilities, or internal conflicts, result in human tragedies, clashing of international interests, and disturbance of global peace. To mitigate the negative effect of a nation’s internal conflict, it is essential to have knowledge, prior to their occurrences, of where and when these conflicts may happen.

In the last few decades, prediction of internal conflicts of nation states has been an ongoing effort.

This thesis extends a recent study conducted by LTC Robert Shearer from the Center for Army Analysis. Several modifications and extensions to Shearer’s work are proposed in this thesis. First, more careful treatment of missing data and data set rescaling is given. Second, rather than projecting a nation state’s various statistics for the next year, their current year statistics are utilized to directly predict the conflict potential. Third, when computing a probability prediction, a weighted average is used instead of an arithmetic average.

With the data set provided by the Center for Army Analysis, this study experiments with the proposed methods and evaluates their performance by the Brier scoring rule. This study provides insights into situations where one method is expected to outperform the others.
I. INTRODUCTION

This thesis proposes several methods to predict the probability that, in the near future, a nation will develop an internal conflict. In 2007, the Center for Army Analysis (CAA) conducted a study on various statistics of a nation state as they relate to internal conflicts [1]. The importance of predicting the internal conflict of a nation is evident in a quote from this study:


Since the end of cold war, economic dislocations,
civil war, famine, and ancient ethnic and
religious animosities have contributed to
conflict and political instability in states
extending from Haiti to the vast archipelago of
Indonesia. These conflicts and instabilities
frequently challenge national security interests;
at other times, the human rights atrocities that
often accompany these dislocations offend the
moral imperatives of individual states as well as
the international community [2]. Increasingly,
the United States military has found itself
executing a range of operations as a direct
result of these conflicts. Understanding where
and when these conflicts could occur is essential
in developing a sound military strategic plan [1].



























































In this CAA study, Shearer [1] identified a vector of 13 features that correlate with whether an internal conflict will occur in a nation. These 13 features can be put into four categories as follows:





• Economic features: male unemployment, GDP per capita, and trade openness.

• Military features: conflict history.

• Political features: civil liberties, democracy, and political rights.

• Social features: adult male literacy, caloric intake, ethnic diversity, infant mortality rate, life expectancy, and religious diversity.








The method used in Shearer’s study can be described briefly as follows:























Step 1: Put each country-year into a 13-dimensional space based on the 13 feature values. If the country has internal conflict in that year, paint the point red; otherwise, paint it blue.

Step 2: Use a weighted moving average on the 13 feature values to project a country’s movement in this space in the future.

Step 3: Predict, based on the projected location of a country in the future and the colors of its neighbors, whether that country will develop an internal conflict in the given year.











Shearer’s method relies on statistical extrapolation, which can be extended indefinitely. This is a significant strength: for each prediction, the method allows one to choose how far into the future to forecast.


Shearer’s [1] method, however, has room for improvement. First, in step 1, a point’s color is painted based on whether, for the same year as the 13 feature values, a country has internal conflict. Yet the actual problem is to use feature values of the current year to predict conflict in the next year. Second, in step 3, the neighbors’ colors are weighted equally without considering the distance between each neighbor and the location under consideration. This thesis explores possible extensions to Shearer’s method [1] to improve prediction results.


A. RESEARCH OBJECTIVE 


This thesis’ objective is to develop new methods to 
improve the quality of predictions from Shearer’s study. 
The most significant differences of the new methods include 
the following: 


• In step 1, points are painted red or blue based on whether the country has internal conflict in the next year. There is no need to project the movement of a country on the 13-dimensional map. Rather than forecasting the future features, the 13 feature values from previous years are utilized to directly predict the future conflict outcome.

• In step 3, to predict the probability of internal conflict, the distances between the location under consideration and its neighbors are taken into account.














To compare this study’s methods with those studied by Shearer [1], this thesis uses the same data set that Shearer used, which contains 13 macro-structural features of 155 nations observed from 1993 to 2003. Further, the Brier scoring rule is utilized to evaluate each prediction method and to compare their performance.
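As a minimal sketch of how a Brier score is computed for binary outcomes, the Python function below averages the squared differences between forecast probabilities and observed outcomes (1 for conflict, 0 for peace). The function name and the sample numbers are illustrative assumptions, not values from the thesis data.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect forecast, 1.0 is the worst possible.
    """
    if len(probabilities) != len(outcomes) or len(outcomes) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative (made-up) forecasts for four nation-years.
forecasts = [0.9, 0.2, 0.6, 0.1]
observed = [1, 0, 0, 0]
score = brier_score(forecasts, observed)
print(round(score, 3))
```

A method that always forecasts 0.5 scores 0.25 on any binary data set, which gives a simple reference point when comparing methods.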


B. LITERATURE REVIEW 


Organizations and scholars have previously studied interstate instability and conflict. Each of these works represents a unique contribution to the development of conflict prediction or crisis early warning. According to O’Brien [2], one study conducted by the State Failure Task Force (SFTF) used logistic regression, neural networks, and genetic algorithms with key feature variables associated with a nation’s internal instability to provide an early warning of state failures.


O’Brien [2] built on SFTF’s study to forecast conflict by presenting a macro-structural approach. O’Brien’s results suggest that predictions for countries experiencing a certain level of intensity can be accomplished based on their similarities. Shearer [1] also presented a macro-structural approach to predict the conflict potential of nations. The O’Brien and Shearer works are similar. However, they used different pattern classification algorithms. O’Brien used fuzzy analysis of statistical evidence (FASE), whereas Shearer used Nearest Centroid (NC) and K-Nearest-Neighbor (KNN) algorithms. Both approaches can forecast nations’ levels of conflict, out to 5 years, with about 80% overall accuracy.








There are other interesting related studies on interstate instability or conflict. For example, Beck, King, and Zeng [3] presented a version of a neural network model that revealed interesting structural features of international conflict. Pevehouse [4] discovered that increased trade dependence can simultaneously stimulate conflict. Kilgour and Zagare [5] used a discrete game model to analyze a problem of limited conflict. Robst, Polachek, and Chang [6] showed how geography and trade influence international conflict. These approaches and studies can generate forecasts that provide the strategic decision maker with good knowledge of when and where a nation will likely experience instability or conflict.








This thesis can be viewed as an extension to the works by O’Brien [2] and Shearer [1]. Further, it uses macro-structural factors to predict a nation’s future conflict level. The goal is to explore other prediction methods to provide a good mechanism to support the decision maker prior to the occurrences of future conflicts.


C. THESIS ORGANIZATION


After this introductory chapter, there are three chapters remaining in this thesis. Chapter II discusses and explains this study’s data set and prediction methodologies. The data set is an internal conflict data set containing 13 feature variables of 155 nations from 1993 to 2003. The proposed methodologies are designed to satisfy the thesis objective and to improve prediction accuracy. Each methodology contains several alternatives that will be discussed in detail.


Chapter III analyzes the prediction results from each prediction method. As in most statistical analyses, each of this study’s methods has both a training set and a test set to validate the predicted values. Because these methods offer a probability prediction, the Brier scoring rule is used to compare the different prediction methods. The goal is to discover whether any of this study’s methods can provide a remarkable improvement over Shearer’s existing method [1].


Chapter IV concludes this thesis, discusses findings, and provides ideas for further study.
II. DATA AND METHODOLOGY

This chapter presents the data and methodology used to predict a nation’s internal conflict. Section A presents the data set. Section B describes the variables from the data set. Section C introduces the prediction methodology.


A. DATA SETS 





The data set contains 155 nations observed from 1993 to 2003 (11 years). For each year, each nation has one conflict indicator and thirteen feature variables. These feature variables are used to identify patterns of nation state instability or conflict. They include three political features (civil liberties, democracy, and political rights); three economic features (male unemployment, GDP per capita, and trade openness); six social features (adult male literacy, caloric intake, ethnic diversity, infant mortality rate, life expectancy, and religious diversity); and one military feature (conflict history). Each nation-year pair is plotted using the 13 feature variables as a point in 13-dimensional space. Thus, there are 155 x 11 = 1,705 data points.





In the raw data set, each conflict indicator gives the nation’s intensity level of conflict in that year. There are four levels of intensity: latent conflict, crisis, severe crisis, and war. Latent conflict is level 1; crisis is level 2; and so on. The feature variables were obtained from multiple studies by the CAA in the late 1990s and 2000s; these variables were identified in Shearer [1] as key macro-structural features that affect nation state stability.


This thesis uses this same international conflict data set. For further analysis, the data is divided into two different groups: independent variables and dependent variables. The next section discusses these two groups of variables in detail.


B. VARIABLES DESCRIPTION 


1. Independent Variables

The independent variables include the 13 feature variables. These feature variables were collected from different sources and are measured on different scales. Some are continuous variables and others are discrete variables. Table 1 describes the definitions and sources of the 13 feature variables. The table shows that some features have missing values in the data set. This is especially significant for Male Unemployment and Adult Male Literacy.





























| Category | Feature | Definition | Source (1) | Percent Missing |
|---|---|---|---|---|
| Political | Civil Liberty | A measure of the freedom of a country’s people to develop views, institutions, and personal autonomy apart from the state. | Freedom House | 0.41 |
| Political | Democracy | A measure of the degree of democracy. | Polity IV Project | 2.50 |
| Political | Political Rights | A measure of the rights to participate meaningfully in the political process. | Freedom House | 0.41 |
| Economic | Male Unemployment | The percentage of the male labor force that is unemployed. | World Bank | 79.40 |
| Economic | GDP | The annual gross domestic product per person measured in constant 1998 U.S. dollars. | World Bank | 4.60 |
| Economic | Trade Openness | The ratio of a country’s total imports and exports to GDP. | World Bank | 4.10 |
| Social | Adult Male Literacy | The percentage of males, ages 15 and above, who can read or write. | World Bank | 28.20 |
| Social | Caloric Intake | An estimate of the average number of calories consumed per person per day. | FAOUN (2) | 1:29:0 |
| Social | Ethnic Diversity | The population of the largest ethnic group in the country as a percentage of the total population. | CIA WFB & CIFPP (3) | 0.65 |
| Social | Infant Mortality Rate | The number of deaths of children under 1 year of age per 1,000 live births. | U.S. Bureau of the Census | 0.00 |
| Social | Life Expectancy | The average life expectancy (males and females combined). | U.S. Bureau of the Census | 0.00 |
| Social | Religious Diversity | The population of the largest religious group in the country as a percentage of the total population. | CIA WFB & CIFPP | 2.50 |
| Military | Conflict History | The percentage of time (in years) spent in a state of conflict (war or severe crisis). | HIIK (4) | 0.00 |

Table 1. Definitions and Sources of Independent Variables.
1. All data were collected by the Center for Army Analysis to produce Shearer [1].

2. FAOUN = Food and Agriculture Organization of the United Nations.

3. CIA WFB & CIFPP = CIA World Fact Book and Country Indicators of Foreign Policy Project.

4. HIIK = Heidelberg Institute of International Conflict Research.


Figure 1 shows the histograms of the 13 feature variables. In Figure 1, it is observed that some variable distributions look normal (e.g., Caloric Intake and Civil Liberty); some variables are distributed over a wide range of values (e.g., Democracy and Political Rights); and some variable distributions are skewed (e.g., Religious Diversity and Life Expectancy). The distributions of all 13 variables are quite distinctive. Since these variables are measured on different scales, they should be rescaled to a common scale for further analysis. Section C discusses three such methods in detail.
Figure 1.








Table 2 summarizes the 13 feature variables. The table gives the minimum, maximum, mean, and standard deviation of each feature variable, excluding the missing values in the data set. The table shows that replacing a missing value by the mean of the feature variable can cause a large error if the missing value is actually an extreme value. This is especially evident for those feature variables with high proportions of missing values.



































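| Category | Feature | Min | Max | Mean | StDev |
|---|---|---|---|---|---|
| Political | Civil Liberty | 1 | 7 | 3.87 | 1.76 |
| Political | Democracy | -10 | 10 | 2.85 | 6.61 |
| Political | Political Rights | 1 | 7 | 3.72 | 1.16 |
| Economic | Male Unemployment | 0% | 42% | 7.89% | 5.86% |
| Economic | GDP | 100 | 59000 | 6007.5 | 10348 |
| Economic | Trade Openness | 0 | 290 | 78.7 | 40.70 |
| Social | Adult Male Literacy | 19% | 100% | 81.7% | 18% |
| Social | Caloric Intake | 1500 | 3800 | 2648.3 | 519.80 |
| Social | Ethnic Diversity | 17% | 100% | 73.1% | 22.35 |
| Social | Infant Mortality Rate | 0 | 200 | 48.4 | 40.60 |
| Social | Life Expectancy | 10 | 81 | 64.2 | 12.20 |
| Social | Religious Diversity | 30 | 100 | 79.4 | 17.20 |
| Military | Conflict History | 0% | 92.9% | 36% | 37.00 |

Table 2. Data Summary of Independent Variables.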
2. Dependent Variable

The dependent variable used is the level, or intensity, of conflict experienced by a country in a given year. This variable was collected from the Heidelberg Institute of International Conflict Research (HIIK) by the CAA. According to HIIK’s definition [7], conflicts are defined as the clashing of interests on national values and issues of some duration and magnitude between at least two parties (states, groups of states, organizations or organized groups). These conflicts concern territory, secession, decolonization, autonomy, system/ideology, national power, regional predominance, international power, resources, and others. There are four levels of intensity: latent conflict, crisis, severe crisis, and war. In the CAA study, Shearer [1] classifies the four levels of intensity into two categories: peace and conflict. This thesis adopts the same classification. Table 3 summarizes the conflict classification of the four levels given in Shearer [1]. The levels of intensity are defined by HIIK [7] as follows:
Latent Conflict:

A latent conflict is a positional difference over definable values of national meaning, only if demands are articulated by one of the parties and perceived by the other as such.

Crisis:

A crisis is a tense situation in which at least one of the parties uses violent force repeatedly in sporadic incidents.
Severe Crisis:


A severe crisis is defined as a state of high tension 
between the parties: either they threaten to resort to 
the use of force or they actually use physical, or 
military, force in an organized way. 











War: 


A war is a violent conflict in which violent force is 
used with certain continuity in an organized and 
systematic way. Depending on the situation, the 
conflict parties exercise extensive measures. The 
extent of destruction is massive and of long duration. 























Table 3 describes how the four levels of intensity are further put into two levels of conflict.


[Table 3 columns: Level of Intensity; Name of Intensity; CAA’s Classification (Level of Conflict). Level 1 is Latent Conflict.]

Table 3. Conflict Classification.








C. METHODOLOGY


This thesis asserts that there is room for improvement in Shearer’s [1] study of conflict pattern prediction. The first issue is that Shearer [1] colored a point (with 13-dimensional coordinates given by the independent variables) based upon whether a nation had internal conflict in the same year. In predicting the level of conflict, it is advantageous to use feature values of the current year to predict conflict in the next year. The second issue in Shearer’s [1] method is that, when predicting the level of an unknown conflict, the colors of its neighbors are weighted equally without considering the distance between each neighbor and the location under consideration. In this thesis, those nearest neighbors are weighted either by their distances or by their ranks. This study’s method can be summarized in four steps as follows:

Step 1: Replace the missing feature data.

Step 2: Rescale the feature variables.

Step 3: Rearrange the data set.

Step 4: Predict the conflict probability.

These four steps are explained below in detail.


1. Replace the Missing Feature Data


In the histograms of the independent feature variable distributions in Section B, it is observed that, for some features, there is much missing data, especially adult male literacy and male unemployment. In CAA’s study, the missing data is replaced by the mean of the overall feature variable. In this thesis, the following rules are used to replace the missing data:
a. For a country that misses all values for one feature variable (e.g., Unemployment in 1993-1997 for Burma), for each year, replace the missing value with the yearly mean of that variable across all other countries in the same year.

b. For a country that misses one or more (but not all) data points, replace the missing data with the value that was last available for the same country. For example, unemployment was only available for Cameroon in 1996; thus, the missing values (1997-2003) were replaced with the value from 1996.
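The two replacement rules can be sketched in Python as follows. This is a minimal illustration for a single feature variable, not the thesis implementation: the function name, the data layout (country to year to value, with None for missing), and the sample numbers are assumptions. The rules as stated do not say how to fill a gap that precedes a country's first observed value, so the sketch assumes the yearly mean is used in that case.

```python
from statistics import mean
from typing import Dict, Optional

Series = Dict[int, Optional[float]]  # year -> value, None when missing


def impute_feature(data: Dict[str, Series]) -> Dict[str, Dict[int, float]]:
    """Fill missing values of one feature using rules (a) and (b)."""
    years = sorted({year for series in data.values() for year in series})

    # Yearly mean over the countries that have an observed value that year.
    # Assumes every year has at least one observed value.
    yearly_mean = {
        year: mean(s[year] for s in data.values() if s.get(year) is not None)
        for year in years
    }

    filled: Dict[str, Dict[int, float]] = {}
    for country, series in data.items():
        last_seen: Optional[float] = None
        out: Dict[int, float] = {}
        for year in years:
            value = series.get(year)
            if value is not None:
                last_seen = value
                out[year] = value
            elif last_seen is not None:
                out[year] = last_seen          # rule (b): carry last value forward
            else:
                # rule (a) when the country has no data at all; also used here
                # (as an assumption) for gaps before the first observed value
                out[year] = yearly_mean[year]
        filled[country] = out
    return filled


# Made-up values for three hypothetical countries.
raw = {
    "A": {1995: 4.0, 1996: 6.0, 1997: 8.0},
    "B": {1995: None, 1996: 10.0, 1997: None},
    "C": {1995: None, 1996: None, 1997: None},
}
filled = impute_feature(raw)
print(filled["B"], filled["C"])
```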
2. Rescale the Feature Variables


In the raw data set, the 13 feature variables are measured on different scales. To calculate the Euclidean distance between two points, each feature must be put on the same scale. This thesis introduces three different scaling methods: normalization, standardization, and principal components. Each method is explained below.


a. Normalization 


Normalizing a feature (variable) refers to scaling by the minimum and range of the variable, which makes the variable score between 0 and 1. Let x_{i,j,k} be the ith feature variable for the jth nation state in the kth year. By normalization, each indicator is scaled into a new variable y_{i,j,k} between 0 and 1, where

\[
y_{i,j,k} = \frac{x_{i,j,k} - \min_{j,k} x_{i,j,k}}{\max_{j,k} x_{i,j,k} - \min_{j,k} x_{i,j,k}}
\]
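A minimal Python sketch of this min-max normalization for one feature is shown below. The function name and the sample values are illustrative assumptions; the input list stands for all nation-year observations of a single feature.

```python
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Scale one feature to [0, 1] using its minimum and range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant feature has zero range and cannot be normalized")
    return [(v - lo) / (hi - lo) for v in values]


# Made-up observations of a single feature across nation-years.
normalized = normalize([0.0, 50.0, 200.0])
print(normalized)
```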


b. Standardization 


To standardize a variable is to subtract its mean from the variable and then divide this difference by the variable’s standard deviation; thus, the standardized variable has mean 0 and standard deviation 1. Let x_{i,j,k} be the ith feature variable for the jth nation state in the kth year. By standardization, each variable is scaled into a new variable y_{i,j,k} with mean 0 and standard deviation 1, where

\[
y_{i,j,k} = \frac{x_{i,j,k} - \operatorname{mean}_{j,k} x_{i,j,k}}{\operatorname{std}_{j,k} x_{i,j,k}}
\]
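Standardization of one feature can be sketched in Python in the same way. The text does not say whether the sample or the population standard deviation is used; this sketch assumes the sample standard deviation (`statistics.stdev`), and the function name and sample values are illustrative.

```python
from statistics import mean, stdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Center one feature at 0 and scale it to unit standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("a constant feature cannot be standardized")
    return [(v - mu) / sigma for v in values]


# Made-up observations of a single feature across nation-years.
standardized = standardize([2.0, 4.0, 6.0])
print(standardized)
```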


c. Principal Component Analysis (PCA)


PCA is a non-parametric method used to identify the correlation among the original variables and to reduce the dimension of the data set while still capturing the essence of the data. In PCA, extraction is performed from a set of m variables to a set of n factors (m > n). By definition, these factors are inferred from the correlations among the m variables, and each factor is estimated as a weighted sum of the m variables. Interested readers can refer to Montgomery, Peck and Vining [8] for a discussion of PCA.

To proceed with PCA, the missing values are replaced using the following additional steps after normalizing (or standardizing) the data set.


Step 1: For feature variables that have more than 10% missing:

• Generate a binary indicator for each one of them, where the indicator equals 1 if the variable is missing and 0 otherwise.

• Replace the missing feature values with the yearly average of the entire sample.

Step 2: For feature variables with less than 10% missing, replace the value with the missing code. When running PCA, the statistical software will automatically drop those missing values.

Step 3: For a country’s missing values in the prediction year, replace them with the most recent existing data from a previous year.

Step 4: Run the principal component model in S-plus.
After the missing values are replaced, the new data set consists of 21 variables (13 original variables plus 8 binary variables). From this set of 21 variables, extraction is used to obtain a set of underlying variables with fewer dimensions.


It is useful to examine the correlation matrix (Table 4) of the new data set. In the correlation matrix, the variables are all correlated, but only a few pairs of variables have significant correlation. Examples are Calories and Infant Mortality Rate (IMR) and, also, IMR and Life Expectancy. PCA is useful when there are strong correlations among feature variables. In Table 4, however, this does not appear to be the case.


[Table 4 rows and columns: Calories, IMR, PolRights, CivLiberty, LifeExp, Democ, ReliDiv, EthnicDiv, GDP, TradeOp, Unemp, AdultLit, TimeConf, MissUemp, MissAdultLit, MisCal, MisDemoc, MissReliDiv, MissEthnicDiv, MissGDP, MissTradeOp.]

Table 4. Correlation Matrix. Red entries indicate that the two variables are highly correlated (≥ 0.7).
Next, the principal components need to be extracted. S-plus is used to extract 21 components. This involves solving 21 equations with 21 unknowns. The variance in the correlation matrix is transformed into 21 eigenvalues. Each eigenvalue represents the variance captured by one component. Each component is a linear combination of the 21 variables. Further, the principal components can be viewed as the axes of a 21-dimensional space in which each axis is perpendicular to every other axis. For the conflict data, the importance of the components is summarized in Table 5.


Importance of components:

| Component | Standard deviation | Proportion of Variance | Cumulative Proportion |
|---|---|---|---|
| Comp.1 | 2.2654297 | 0.2443891 | 0.2443891 |
| Comp.2 | 1.4652305 | 0.1022334 | 0.3466225 |
| Comp.3 | 1.26751906 | 0.07650498 | 0.42312747 |
| Comp.4 | 1.19777504 | 0.06831738 | 0.49144486 |
| Comp.5 | 1.14205619 | 0.06210916 | 0.55355402 |
| Comp.6 | 1.07280493 | 0.05480526 | 0.60835927 |
| Comp.7 | 1.03954849 | 0.05146005 | 0.65981932 |
| Comp.8 | 1.01622953 | 0.04917726 | 0.70899658 |
| Comp.9 | 0.98247768 | 0.04596488 | 0.75496146 |
| Comp.10 | 0.90558493 | 0.03905162 | 0.79401308 |
| Comp.11 | 0.89313246 | 0.03798503 | 0.83199811 |
| Comp.12 | 0.82793736 | 0.03264192 | 0.86464003 |
| Comp.13 | 0.79579748 | 0.03015684 | 0.89479687 |
| Comp.14 | 0.71600737 | 0.02441269 | 0.91920956 |
| Comp.15 | 0.70135554 | 0.02342379 | 0.94263335 |
| Comp.16 | 0.60536109 | 0.01745057 | 0.96008393 |
| Comp.17 | 0.57675691 | 0.01584041 | 0.97592433 |
| Comp.18 | 0.51053986 | 0.01241195 | 0.98833628 |
| Comp.19 | 0.334315582 | 0.005322234 | 0.993658516 |
| Comp.20 | 0.271984790 | 0.003522654 | 0.997181170 |
| Comp.21 | 0.24330112 | 0.00281883 | 1.00000000 |

Table 5. Importance of Components.


After extraction from the original 21 variables (13 feature variables and 8 binary variables), there are 21 components.


Thus far in this study, 21 correlated variables have been mapped to 21 uncorrelated components by linear transformation. A decision is needed to determine how many components are required. A rule of thumb is to retain the components that capture at least 90% of the variance, which, in this study’s case, leaves 13 components. To decide on the number of components to retain, another device, the scree plot (Figure 2), is used. The plot provides a visual aid to decide which components are necessary to retain.
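The variance-threshold rule of thumb can be sketched in Python as below. This is an illustrative helper, not the S-plus procedure: it takes the component standard deviations (as reported in an importance-of-components summary), converts them to proportions of variance, and returns the smallest number of leading components whose cumulative proportion reaches the threshold. The function name and the sample standard deviations are made up for the example.

```python
from typing import Sequence


def components_for_variance(std_devs: Sequence[float], threshold: float = 0.90) -> int:
    """Smallest number of leading components reaching the variance threshold.

    Each component's variance is its squared standard deviation; proportions
    are taken relative to the total variance of all components.
    """
    variances = [s ** 2 for s in std_devs]
    total = sum(variances)
    cumulative = 0.0
    for count, variance in enumerate(variances, start=1):
        cumulative += variance / total
        if cumulative >= threshold:
            return count
    return len(variances)  # guards against floating-point shortfall at 100%


# Made-up standard deviations, sorted from largest to smallest.
example_sd = [2.0, 1.0, 1.0]          # variances 4, 1, 1 -> 66.7%, 83.3%, 100%
print(components_for_variance(example_sd, threshold=0.80))
print(components_for_variance(example_sd, threshold=0.90))
```

A threshold rule and a scree plot can disagree, so the two are best read together rather than treating either as decisive.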


[Figure 2 axes: Variances (vertical) against Comp.1 through Comp.10 (horizontal).]

Figure 2. Scree Plot of principal components. Each number on the top of a bar in the plot indicates the proportion of variance. Here, there are only 10 components to view, but this plot gives the same information as the importance of components indicated above.





Next, the first three components are used to plot the conflict data (Figure 3). The points appear to be clustered for both peace and conflict data. What remains unknown is how the data would look in 13 dimensions. Although there is not a high correlation among the feature variables, the conflict cluster plot appears to indicate that it is reasonable to use these principal components as the new data set to predict the future conflict level.
Figure 3. Conflict Cluster Maps. Red indicates conflict; blue indicates peace.


3. Rearrange the Raw Data Set


In the original raw data set, the 13 feature values of a country are used to predict whether that country will develop an internal conflict in the same year. To illustrate the idea, Figures 4-10 are plotted with two feature variables: Political Rights and Infant Mortality Rate. The location of each point is determined by the values of these two feature variables from the year written inside the point. A red point indicates conflict, whereas a blue point indicates peace. The color is based on the conflict-peace status of the year written outside the point. In the raw data set (Figure 4), the numbers inside and outside the point are identical.


Since the internal conflict and the feature variables are taken from the same year in the raw data set, there is a need to project the future feature variables from the past prior to predicting the future conflict level. For example, in Figure 4, if one wanted to predict the conflict level in Chile in 2000, then it is necessary to project the feature values for Chile in 2000 based on data from 1995 to 1999. The CAA study uses a statistical extrapolation, a weighted moving average, to project the future feature variables. Because the future features are projected from the past, this method can be used to predict as far into the future as desired.





[Figure 4 axes: Political Right, 0 to 1 (horizontal); Infant Mortality Rate (vertical). Labeled points include Bhutan, Rwanda, Albania, Japan, Chile, and Cuba.]

Figure 4. Prediction from the raw data set. The two feature variables (Political Right and Infant Mortality Rate) had been scaled between 0 and 1.


What is needed is to use feature variables of the current year to predict conflict in the next year. This can be accomplished directly without projecting a country’s future feature values. To do that, the data set must be rearranged. The idea is to paint the point with the color based on the conflict-peace status from one year later. Figure 5 conceptually illustrates how the one-year-out prediction looks in two dimensions. The same concept can be applied where there is a desire to predict a country’s conflict-peace status two and three years later (Figures 6 and 7).


[Figures 5 to 7 are titled "One Year Out Prediction", "Two Year Out Prediction", and "Three Year Out Prediction"; each has axes Political Right, 0 to 1 (horizontal), and Infant Mortality Rate (vertical).]

Figure 5. One-Year-Out-Prediction. The points are painted with the color based on the conflict-peace status from one year later.

Figure 6. Two-Year-Out-Prediction. The points are painted with the color based on the conflict-peace status from two years later.

Figure 7. Three-Year-Out-Prediction. The points are painted with the color based on the conflict-peace status from three years later.


By using this data set arrangement, there is no need for a statistical extrapolation to predict the future feature values. Because the extrapolation step is eliminated, this method is easy to carry out and obtains the conflict prediction efficiently.
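The rearrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a dictionary keyed by country and year), the function name `lag_labels`, and the toy values are hypothetical and are not taken from the thesis data set.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (country, year) -> (feature vector, conflict status).
Record = Tuple[List[float], int]


def lag_labels(data: Dict[Tuple[str, int], Record], horizon: int):
    """Pair each year's features with the conflict status `horizon` years later."""
    rearranged = []
    for (country, year), (features, _status) in sorted(data.items()):
        future = data.get((country, year + horizon))
        if future is None:
            continue  # no status observed that far ahead, so the point is dropped
        rearranged.append((country, year, features, future[1]))
    return rearranged


# Made-up values for illustration only (not from the thesis data set).
toy = {
    ("Chile", 1995): ([0.2, 0.1], 0),
    ("Chile", 1996): ([0.2, 0.1], 0),
    ("Chile", 1997): ([0.3, 0.1], 1),
}
one_year_out = lag_labels(toy, horizon=1)
two_year_out = lag_labels(toy, horizon=2)
```

Note that the longer the horizon, the more recent points are dropped, because their future status has not yet been observed.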


4. Predict the Conflict Probability 


This study uses the K-Nearest-Neighbor (KNN) algorithm to predict the probability of future conflict for each nation. KNN classifies a future point according to its Euclidean distance from all past points in the 13-dimensional space: the future point is classified as a function of the classes (peace or conflict) of its k closest past points, where k is the number of closest points chosen. In Shearer's [1] study, the future point is colored by weighting the k closest past points equally. For instance, in Figure 8, k is set to 5 (five closest neighbors). Three out of the five points are red (conflict), so the chance of being red for that future point in 2000 is 3/5 = 0.6.








[Figure 8: Infant Mortality Rate versus Political Rights (0 to 1).]
Figure 8. Predict conflict potential by equally weighting the closest neighbors.





The general mathematical form of this classification rule can be expressed by

\[ P(\text{Conflict}) = \frac{\sum_{i=1}^{k} C_i}{k}, \qquad C_i \in \{0, 1\} \]

C_i is the level of conflict of the ith nearest neighbor (0 indicates peace, whereas 1 indicates conflict).

k = number of the nearest neighbors.
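As a minimal sketch of the equally weighted rule, the following Python code finds the k nearest past points by Euclidean distance and averages their statuses. The function names and the two-dimensional toy points are assumptions made for illustration; the thesis works in 13 dimensions and does not publish code.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layout: each past point is (feature vector, status), status 1 = conflict.
Point = Tuple[Sequence[float], int]


def nearest_neighbors(past: List[Point], query: Sequence[float], k: int):
    """Return (distance, status) pairs for the k past points closest to `query`."""
    scored = [(math.dist(features, query), status) for features, status in past]
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


def p_conflict_equal(past: List[Point], query: Sequence[float], k: int) -> float:
    """Equally weighted KNN: the fraction of the k nearest neighbors in conflict."""
    neighbors = nearest_neighbors(past, query, k)
    return sum(status for _, status in neighbors) / len(neighbors)


# Made-up points: three of the five nearest neighbors are in conflict.
past = [
    ((0.5, 0.6), 1), ((0.6, 0.5), 1), ((0.4, 0.5), 0),
    ((0.5, 0.3), 1), ((0.2, 0.5), 0), ((0.9, 0.9), 0),
]
p = p_conflict_equal(past, query=(0.5, 0.5), k=5)
```

With these toy points the result is 3/5, matching the kind of count illustrated in Figure 8.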
In this thesis, two variations of the KNN method are studied to predict the conflict probability. In the first variation, a country's future status is predicted by a weighted average over its neighbors, with each neighbor's weight proportional to the inverse of its distance. For instance, in Figure 9, k is set to 5 to predict the conflict potential in 2000. The status of each of the closest neighbors (1 or 0) is weighted by the inverse of its distance, the weighted statuses are summed, and the sum is divided by the total of the weights. Therefore, the chance of being red for that future point in 2000 is

\[ \frac{(0)(1/6) + (1)(1/4) + (1)(1/8) + (1)(1/7) + (0)(1/5)}{(1/6) + (1/4) + (1/8) + (1/7) + (1/5)} \approx 0.585 \]


[Figure 9: Infant Mortality Rate versus Political Rights (0 to 1).]
Figure 9. Predict conflict potential by the closest neighbors weighted by the inverse of distance. D# refers to the distance between point # and the predicted point.








The general mathematical form of this classification rule, the k nearest neighbors weighted by the inverse of distance, can be expressed by

\[ W_i = \frac{1}{D_i}, \qquad C_i \in \{0, 1\} \]

\[ P(\text{Conflict}) = \frac{\sum_{i=1}^{k} (W_i \times C_i)}{\sum_{i=1}^{k} W_i} \]

D_i is the Euclidean distance between the ith nearest neighbor and the future feature vector.

W_i is the weight of the ith nearest neighbor.
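The inverse-distance rule can be sketched as follows. The function name is hypothetical, the distances and statuses are illustrative values, and the handling of a zero distance is an assumption, since the text does not say how an exact match is treated.

```python
from typing import Sequence, Tuple


def p_conflict_inverse_distance(neighbors: Sequence[Tuple[float, int]]) -> float:
    """neighbors: (distance, status) pairs for the k nearest past points."""
    # Assumption: an exact match (distance 0) decides the outcome on its own,
    # because 1/0 is undefined; the source does not specify this case.
    exact = [status for distance, status in neighbors if distance == 0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / distance for distance, _ in neighbors]
    weighted = sum(w * status for w, (_, status) in zip(weights, neighbors))
    return weighted / sum(weights)


# Illustrative distances and statuses for five neighbors (1 = conflict).
example = [(6, 0), (4, 1), (8, 1), (7, 1), (5, 0)]
p = p_conflict_inverse_distance(example)
```

For these values the probability is 435/743, or about 0.585, slightly below the equally weighted 3/5 because two of the three closest points are at peace.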


In the second variation of the KNN method, a country's future status is predicted by a weighted average over its neighbors, with each neighbor's weight proportional to the inverse of its rank. The rank is determined by the distance: the closer the distance, the lower the rank. For instance, in Figure 10, k is set to 5 and the closest past points are ranked from 1 to 5, where 1 is the closest point and 5 is the farthest point. To predict the conflict potential in 2000, the status of each of the closest neighbors (1 or 0) is weighted by the inverse of its rank, the weighted statuses are summed, and the sum is divided by the total of the weights. Thus, the chance of being red for that future point in 2000 is

\[ \frac{\sum_{i=1}^{5} C_i \times (1/R_i)}{(1/1) + (1/2) + (1/3) + (1/4) + (1/5)} \]

[Figure 10: Infant Mortality Rate versus Political Rights (0 to 1).]
Figure 10. Predict conflict potential by the closest neighbors weighted by the inverse of rank. R# refers to the rank between point # and the predicted point.





In this classification rule, the k nearest neighbors are weighted by the inverse of rank. The general mathematical form can be expressed by

\[ W_i = \frac{1}{R_i}, \qquad C_i \in \{0, 1\} \]

\[ P(\text{Conflict}) = \frac{\sum_{i=1}^{k} (W_i \times C_i)}{\sum_{i=1}^{k} W_i} \]

R_i is the rank of the ith nearest neighbor among the k nearest neighbors, based upon the distance from the future feature vector.

W_i is the weight of the ith nearest neighbor.
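A comparable sketch for the inverse-rank rule is given below. The function name and the example values are hypothetical, and breaking ties by input order is an assumption; the text does not say how equal distances are ranked.

```python
from typing import Sequence, Tuple


def p_conflict_inverse_rank(neighbors: Sequence[Tuple[float, int]]) -> float:
    """neighbors: (distance, status) pairs for the k nearest past points."""
    # Rank 1 is the closest neighbor. Python's sort is stable, so ties keep
    # their input order (an assumption made only for this sketch).
    ordered = sorted(neighbors, key=lambda pair: pair[0])
    weights = [1.0 / rank for rank in range(1, len(ordered) + 1)]
    weighted = sum(w * status for w, (_, status) in zip(weights, ordered))
    return weighted / sum(weights)


# Illustrative distances and statuses for five neighbors (1 = conflict).
example = [(6, 0), (4, 1), (8, 1), (7, 1), (5, 0)]
p = p_conflict_inverse_rank(example)
```

Here the closest point (distance 4) is in conflict and receives weight 1, so the result, 87/137 or about 0.635, is higher than under inverse-distance weighting of the same points. Rank weights depend only on the ordering, not on how far apart the neighbors actually are.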

III. ANALYSIS


Based on this study's proposed methodology, there are three different ways to rescale the raw data and three KNN variations to classify the future conflict potential. This gives a total of nine methods (see Table 6).


















































Rescaling method     | Classification weighted Equally | Classification weighted by 1/Distance | Classification weighted by 1/Rank
Normalized           | NE Method                       | ND Method                             | NR Method
Standardized         | SE Method                       | SD Method                             | SR Method
Principal Components | PE Method                       | PD Method                             | PR Method
Table 6. Method table. The name of each method combines the initial letter of the rescaling method (N, S, or P) with the initial letter of the classification weighting (E, D, or R).








Due to the availability of the data set, this study conducts conflict predictions for 155 nations up to three years out. The predicted results are evaluated by the Brier scoring rule, using the data from 1998 to 2003 for validation. The scores are then used to determine which method performs best.


The rest of this chapter is organized as follows. Section A describes the one-year-out prediction, in which the data up to the current year are used to predict the conflict probability of the next year. Section B compares the different prediction methods in one-year-out prediction. Section C discusses the results of the two- and three-year-out predictions. Finally, the overall performance across these three predictions and the determination of which method gives the best prediction result are discussed.


A. ONE-YEAR-OUT PREDICTION

In one-year-out prediction, this study predicts the conflict potential of the next year based on the data available in the current year. The data set is divided into a training set and a test set, depending on the predicted year. The training set is the data set before the predicted year and is used to predict the conflict potential in the predicted year. The test set is the data set used to validate the predicted results. For example, to predict the conflict potential in 1998, the data set from 1994 to 1997 is used as the training set; the data in 1998 is used as the test set. The conflict occurrence in 1998 is either 1 (conflict) or 0 (peace). The predicted results in 1998, however, are measured in terms of probability. The same idea applies in predicting the conflict potential in 1999, 2000, and so on.

In Section II, the KNN algorithm variations were discussed. These variations are used to predict the conflict potential in the future. The k-value of KNN refers to the number of nearest neighbors in the past. These neighbors' distances are measured by Euclidean distance from the predicted point. This thesis varies k over 11 different values: 1, 2, ..., 10, and 15. In each prediction method, each k-value offers a set of probability assessments of the 155 nations' future conflict potential.








To evaluate a probability prediction when there are two possible outcomes, there are two commonly used proper scoring rules: the Brier scoring rule and the logarithmic scoring rule [9]. With this study's prediction methods, as well as with those in Shearer's [1] report, a conflict probability of 0 is sometimes predicted. In these instances, if conflict then occurs, the logarithmic scoring rule results in a score of negative infinity. For that reason, this study chooses the Brier scoring rule. The Brier scoring rule is defined as follows:


\[ \text{Brier score} = \begin{cases} (1 - P(\text{Conflict}))^2 & \text{if conflict} \\ P(\text{Conflict})^2 & \text{if peace} \end{cases} \]
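The two scoring rules can be compared with a short Python sketch. The function names are hypothetical, and the logarithmic score is written here as the natural log of the probability assigned to the outcome that occurred, which is one common convention and an assumption about the form used in [9].

```python
import math


def brier_score(p_conflict: float, conflict_occurred: bool) -> float:
    """Penalty score for one prediction: lower is better."""
    if conflict_occurred:
        return (1.0 - p_conflict) ** 2
    return p_conflict ** 2


def log_score(p_conflict: float, conflict_occurred: bool) -> float:
    """Log of the probability assigned to the observed outcome (assumed form)."""
    p_outcome = p_conflict if conflict_occurred else 1.0 - p_conflict
    # math.log(0) raises ValueError, so the limiting value is returned explicitly.
    return math.log(p_outcome) if p_outcome > 0 else -math.inf


worst_brier = brier_score(0.0, conflict_occurred=True)
worst_log = log_score(0.0, conflict_occurred=True)
```

A predicted probability of 0 followed by conflict gives a finite Brier penalty of 1 but a logarithmic score of negative infinity, so a single such case would dominate any average of logarithmic scores.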


Since the Brier score is a penalty score, a lower score indicates a better prediction. This study uses the Brier scoring rule to conduct two validations: first, to determine which prediction method provides the lowest score (highest accuracy); and second, for the selected method, to determine which k-value provides the lowest score. The predictions of the 155 nations' future conflict potential are assessed under all methods.
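The two validations can be sketched as an average Brier score per k-value followed by a search for the minimum. The function names and the toy predictions are hypothetical and are not results from this study.

```python
from typing import Dict, List, Tuple

# Each prediction is (predicted conflict probability, observed status 0 or 1).
Prediction = Tuple[float, int]


def mean_brier(predictions: List[Prediction]) -> float:
    """Average Brier score; (status - p)^2 equals (1-p)^2 for conflict, p^2 for peace."""
    total = 0.0
    for p_conflict, status in predictions:
        total += (status - p_conflict) ** 2
    return total / len(predictions)


def best_k(predictions_by_k: Dict[int, List[Prediction]]) -> Tuple[int, float]:
    """Return the k-value with the lowest average Brier score, and that score."""
    scores = {k: mean_brier(preds) for k, preds in predictions_by_k.items()}
    k_star = min(scores, key=scores.get)
    return k_star, scores[k_star]


# Made-up predictions for three nation-years under two k-values.
toy = {
    1: [(1.0, 1), (1.0, 0), (0.0, 0)],
    2: [(0.5, 1), (0.5, 0), (0.0, 0)],
}
k_star, score = best_k(toy)
```

In this toy case k=1 makes one confident miss, which costs a full point, so k=2 has the lower average. The same comparison can be repeated across methods by treating each method's best score as its summary.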


B. COMPARISON OF DIFFERENT PREDICTION METHODS 


Figure 11 gives the results of one-year-out prediction from 1998 to 2003. In the plot, each curve represents the average Brier score over 155 nations between 1998 and 2003 using one prediction method.





Recall that this study proposes nine prediction methods (Table 6). Together with the method in Shearer [1], there are 10 curves in Figure 11. Shearer's [1] score pattern starts from its lowest score at k=1 and then, as k increases, the score increases dramatically. This shows that, for Shearer's method, the more neighbors used to predict the conflict potential, the less accurate the result. The patterns of this study's proposed methods start from higher scores at k=1, decrease at k=2, and then move up and down irregularly.


From Figure 11, there are two observations. First, among the three rescaling methods, the best methods based on the Brier score appear to be those using normalized data (0.071), followed by those using standardized data (0.077) and then by those using principal components (0.077). Thus, among this study's proposed methods, those using normalized data appear to be the best prediction methods because they provide the lowest Brier scores.


Second, for each rescaling method, the lowest score is obtained by the method using the classification rule weighted by the inverse of distance. The methods using equally weighted classification always have the highest score for each k value. Hence, weighting the neighbors unequally does improve the prediction accuracy. Further, according to the score pattern of each method, increasing the number of closest neighbors used to predict the conflict potential of a nation does not provide a better prediction. This is probably due to the density of the cluster map in 13-dimensional space: more alien points exist within the clusters, and the conflict and peace clusters are not clearly separated from each other. Overall, the performance of the 10 methods based on the Brier score can be ordered, from the best to the worst, as follows: Shearer [1], ND, NR, NE, SD, SR, SE, PD, PR, and PE methods.

According to the Brier score, the methods using principal components (PE, PD, and PR methods) are the worst. They need at least k=8 to obtain their lowest score, which is still not the lowest score of all. Using principal components to rescale the feature values is also the least efficient approach, because the extra step is time-consuming. Due to these drawbacks, this study drops the PE, PD, and PR methods from the two- and three-year-out predictions that follow.








[Figure 11: score (vertical axis, 0.04 to 0.1) versus k-value (horizontal axis: 1 to 10, and 15) for the ND, NR, NE, SD, SR, SE, PD, PR, PE, and Shearer [1] methods.]
Figure 11. Result: One-Year-Out Prediction (1998-2003). The lowest score of all methods is given by the Shearer method. The lowest score of this study's proposed methods is given by ND method. The difference is 0.011.

C. TWO- AND THREE-YEAR-OUT PREDICTIONS


In one-year-out prediction, the Shearer [1] method outperforms this study's proposed methods by at least 0.011 in average Brier score per nation, and it provides its best prediction using only the one closest neighbor to predict a nation's conflict status. This section looks at how each of the proposed methods, excluding those using principal components, performs in the two- and three-year-out predictions.


In two-year-out prediction, the conflict potential is predicted based on the conflict-peace status from two years later. Figure 12 shows the results of two-year-out prediction. The lowest score of the Shearer [1] method is obtained at k=1, whereas the lowest scores of all of this study's methods are obtained at k=2. The score of Shearer's [1] method increases as the k value increases and then decreases for k>10. For this study's methods, the score patterns start at a higher score at k=1, decrease dramatically at k=2, and then increase very slowly, except for those using the equally weighted classification rule. In brief, the score patterns show that increasing the k value does not improve the prediction for this study's methods, but it does show a potential improvement for the Shearer [1] method.


[Figure 12: score (vertical axis, 0.06 to 0.12) versus k-value (horizontal axis) for the ND, NR, NE, SD, SR, SE, and Shearer [1] methods.]
Figure 12. Result: Two-Year-Out Prediction (1998-2003). The best predicted result is given by ND method at k=2. This method improves by 0.019 from the Shearer method.





In three-year-out prediction, this study predicts the conflict potential based on the conflict-peace status from three years later. Figure 13 shows the prediction results. The Shearer [1] method does not work as well as this study's methods. Compared with the two-year-out prediction, this study's methods reach approximately the same lowest score, and the score pattern for each method is similar, except that the lowest score is provided by the SD method at k=3. The difference from the ND method's lowest score, however, is marginal. The lowest score of Shearer's [1] method is obtained not at k=1 but at k=15, and the pattern suggests that it would continue to decrease as k increases. Comparing the pattern of the SD method with that of the Shearer [1] method, increasing the k value does not improve SD performance at all, but it shows a potential improvement in Shearer's [1] method.

Figure 13. Result: Three-Year-Out Prediction (1998-2003). The best predicted result is given by SD method at k=2. This method improves by 0.019 from Shearer's method.


Next, this study determines which method and k value provide the lowest score in predicting the 155 nations' conflict potential up to three years out (validated from 1998 to 2003). To obtain the overall performance of each method, this study combines the three results from the one-, two-, and three-year-out predictions. The combined result for predicting the 155 nations' conflict potential from 1998 to 2003 at k=1, 2, ..., 10, and 15 is the average of the three scores. The results are shown in Figure 14.

Figure 14. Result: Overall Prediction (1998-2003). The best predicted result is given by ND method at k=2. This method improves by 0.019 from the Shearer method.


D. DISCUSSION 


Up to now, this study has used its proposed methods to predict the conflict status in 155 nations from 1998 to 2003. Three different predictions (one-, two-, and three-year-out) were applied to compare the prediction results. In each year, k was set to 11 different values to predict the conflict probability, and the Brier scoring rule was used to evaluate accuracy. This study compared its proposed methods with Shearer's [1] method and determined which method is the most accurate and efficient.


In one-year-out prediction, this study's proposed methods did not show any improvement over Shearer's method. They did, however, show some improvement in the two- and three-year-out predictions. The overall prediction performance for each method is the combined result of the three predictions: one-year-out, two-year-out, and three-year-out. The result suggests that, overall, this study's methods provide better prediction. Among these prediction results, the three-year-out prediction gives the largest improvement, lowering the average Brier score by 0.055. This is followed by the two-year-out prediction (0.02) and then by the one-year-out prediction (-0.01, a negative improvement). It appears that this study's method is more suitable for predicting the conflict probability further into the future. One possible reason is that the moving average method Shearer [1] used to project a nation's 13 statistics does not work well for projecting their values far into the future.





In the one-, two-, and three-year-out predictions, this study validates the predicted results from 1998 to 2003. The lowest scores of Shearer's [1] method are 0.06, 0.091, and 0.109 in the three predictions; the corresponding k values are 1, 1, and 15. These results show that the prediction error of Shearer's [1] method increases as the prediction reaches further out. This implies that the conflict and peace clusters in the cluster map become less clearly separated from each other, so that a larger k value is needed to obtain a lower score. Note that in Shearer's method, the cluster densities in the one-, two-, and three-year-out predictions are the same. The cluster maps for these three predictions have the same data points, but their pictures are not the same because the projected feature values differ. Once the clusters are not clearly separated (more alien points in each cluster), it is necessary to use more neighbors to obtain satisfactory accuracy.


For this study's methods, the score pattern shows the biggest improvement at k=2. This k-value also provides the lowest score in the overall prediction. In the one-, two-, and three-year-out predictions, the methods that provide the lowest scores are ND, ND, and SD; their corresponding k values are 2, 2, and 3; and their scores are 0.071, 0.071, and 0.068. This shows that as the conflict cluster density decreases, a larger k value may be needed to get the lowest score, but the required k value increases very slowly and the improvement is not significant. When the data points are plotted in the 13-dimensional space, the densest conflict cluster map is provided by the one-year-out prediction, followed by the two- and three-year-out predictions, respectively. Thus, this study's outcomes suggest that even as the cluster map gets sparser, the conflict and peace clusters remain clearly separated from each other, and a small k value still provides a good prediction result. In all, this observation shows that k=2 is good if the density of the cluster plot is high, whereas if the density is low, a larger k value may be better.


Among this study's proposed methods, each method has its own score pattern. These score patterns indicate which methodology is the best to apply. For instance, comparing the prediction results by rescaling method, this study identified that prediction using the normalized data outperforms the other rescaling methods. This study also identified that the methods using the classification rule weighted by the inverse of distance provide the highest accuracy. In the overall prediction, the ND method provides the most accurate prediction (lowest score).

IV. CONCLUSION AND RECOMMENDATION


In this thesis, several probability prediction methods are proposed to predict the future conflict potential in 155 nations. This study applies these methods to predict the conflict potential for up to three years into the future. The probability predictions are evaluated using the Brier score. The results suggest that, overall, this study's proposed methods give lower scores (higher accuracy) than the Shearer [1] method.


Nevertheless, different methods are more applicable in different scenarios. For instance, the results show that Shearer's method is better for one-year-out prediction, whereas this study's methods are better for two- and three-year-out predictions. It appears that the statistical projections do not work well for projecting future feature variables: the further out the prediction, the larger the projection error. Since this study's methods do not apply an extrapolated projection, the trend pattern for each feature variable is well maintained when predictions of two years out or more are conducted; therefore, the prediction error is smaller.


In addition, the selection of the k value plays an essential role in predicting the conflict potential, and it affects prediction results differently in different settings. For instance, when the density of the cluster is high, Shearer's method provides the best prediction result with a small k in one-year-out prediction, but as predictions reach more years out, a larger k may be better. When the density is low, as in predictions more years out, this study's method provides a more accurate prediction with a small k value. The objective is to obtain the lowest score, and k=2 was consistently a good choice for this study's methods.


The overall result suggests that the best method to obtain the highest accuracy is to use normalized data and predict future conflict potential from the two closest neighbors, weighted by the inverse of distance. On average, this method improves the Brier score by about 0.02 in predicting the future conflict potential for 155 nations, validated from 1998 to 2003. This method is also an efficient way to predict the future conflict potential, because there is no need to forecast future features or to use many neighbors (a large k-value) to predict the conflict probability.











Unfortunately, this study's proposed methods have one disadvantage. Without a trend function to predict the future feature variables, the prediction horizon is limited by the availability of data. For instance, the current data set cannot be used to predict the conflict probability in 2015, as Shearer did with his method, because this study has no data spanning 12 years.


In closing, what was observed in this thesis can be built upon, explored, and verified with predictions further out. Further, it is expected that other methods can be explored to minimize prediction error, to shed significant light on the possibility of predicting conflict potential by using different concepts and methodologies, and to provide a more efficient and accurate early warning of state failure in the future.